A STOCHASTIC ANALYSIS OF AUTOREGULATION OF GENE EXPRESSION

RENAUD DESSALLES, VINCENT FROMION, AND PHILIPPE ROBERT 


Abstract. This paper analyzes, in the context of a prokaryotic cell, the stochastic variability of the number of proteins when gene expression is controlled by an autoregulation scheme. The goal of this work is to estimate the efficiency of the regulation in limiting the fluctuations of the number of copies of a given protein. The autoregulation considered in this paper relies mainly on a negative feedback: the proteins are repressors of their own gene expression. The efficiency of a production process without feedback control is compared to that of a production process with autoregulation of the gene expression, assuming that both of them produce the same average number of proteins. The main characteristic used for the comparison is the standard deviation of the number of proteins at equilibrium. With a Markovian representation and a simple model of repression, we prove that, under a scaling regime, the repression mechanism follows a Hill repression scheme with a hyperbolic control. An explicit asymptotic expression of the variance of the number of proteins under this regulation mechanism is obtained. Simulations are used to study other aspects of autoregulation such as the rate of convergence to equilibrium of the production process and the case where the control of the production process of proteins is achieved via the inhibition of mRNAs.

1. Introduction


1.1. Biological Context. Gene expression is the process by which genetic information is used to produce functional gene products: proteins and non-coding RNAs. This paper is concerned with the production of proteins. The information flow from DNA genes to proteins is a fundamental process. It is composed of three main steps: gene activation, transcription and translation.

(1) The initiation of transcription is strongly regulated. Schematically, the gene is said to be in the “inactive state” if a repressor is bound to the gene’s promoter, preventing the RNA polymerase from binding, and in the “active state” otherwise.

(2) When the gene is in the active state, the RNA polymerase binds and initiates transcription, which leads to the creation of an mRNA, a copy of a specific DNA sequence.

(3) The translation of the messenger into a protein is achieved by a large complex molecule: the ribosome. A ribosome binds to an active mRNA, initiates the translation and proceeds to protein elongation. Once the elongation terminates, the protein is released in the medium and the ribosome is again available for another translation.

The production of proteins is the most important cellular activity, both for its functional role and for its high cost in terms of resources. In an E. coli bacterium for example there are about $3.6 \times 10^6$ proteins of approximately 2000 different types with a large variability in concentration, depending on their types: from a few dozen up to $10^5$. Gene expression is additionally a highly stochastic process and results from the realization of a very large number of elementary stochastic processes of different natures. The three main steps are the results of a large number of encounters of macromolecules following random motions, due in particular to thermal excitation, in the viscous fluid of the cytoplasm. One of the key problems is to understand the basic mechanisms which allow a cell to produce a large number of proteins with very different concentrations in a random context. This can be seen as a problem of minimization of the variance of the number of proteins of each type.

To study this problem, one can take a simple stochastic model, preferably with a limited set S of parameters, describing the three steps of the production of a given type of protein. Once a closed form expression of the variance of the number of proteins is obtained, it is natural to find the parameters of the set S which minimize the variance under the constraint that the mean number of proteins is fixed. See the survey Paulsson [25].

A more effective way to regulate the number of proteins can be to use a direct feedback control, an autoregulation mechanism, so that the production of proteins is either sped up or slowed down depending on the current number of proteins. It should be noted that the feedback control loop can involve other intermediate proteins to achieve this goal, as in the classical lac operon, but this case is not considered here. See Yildirim and Mackey [39] for example.

The protein can regulate the gene activation simply, for example by being a repressor that tends to bind to its own gene’s promoter. This is the autogenous regulation scheme. See Goldberger, and Maloy and Stewart [21]. See also Thattai and van Oudenaarden. Other autoregulation mechanisms are possible in cells, such as an autoregulation on the mRNAs where a protein inhibits its own translation initiation by binding to the translation initiation region of its own mRNAs. It occurs for example in the production of ribosomal proteins, see Kaczanowska and Rydén-Aulin. The idea is that a feedback mechanism may significantly reduce the number of large excursions from the mean. In this paper, the mathematical analysis will mainly focus on a negative autogenous feedback, where the rate of inactivation of the gene grows with the number of proteins.

1.2. Literature. The variance of the number of proteins has been analyzed mathematically in the classical works of Berg [3] and Rigney [27, 28], reviewed more recently by Paulsson [25]; see also Raj and van Oudenaarden for the biological aspects. These references use the three stage model, in which the state of the system is given by three variables: the state of the promoter, the number of mRNAs and the number of proteins. Mathematically, the techniques used rely on the Fokker-Planck equations of the associated three dimensional Markov process and on the observation that, at equilibrium, a recurrence on the moments of the number of proteins holds. Fromion et al. [9] investigate a more general model (in particular, elongation times are not necessarily exponentially distributed) and introduce an alternative to the Markovian approach.

Concerning the evaluation of autoregulation, most mathematical models use a continuous state space: the rate of production of proteins depends linearly on the number of mRNAs and the rate of production of mRNAs is a function $k(p)$ exhibiting a non-linear dependence on the current number $p$ of proteins. In Rosenfeld et al. [31] and Becskei and Serrano [5], based on experiments, the function $k(p)$ is taken to be a Hill repression function, i.e. $k(p) = a/(b + p^n)$ for some constants $a$ and $b$, where $n \geq 1$ is the Hill coefficient. See also Thattai and van Oudenaarden. Related models in a similar framework with further results are presented in Bokes et al. and Yvinec et al. [40]. For most of these models the state of the promoter, active or inactive, which is a source of variability, is not taken into account; it is in some way encapsulated in the function $k(p)$, whose representation is rarely discussed. In Hornos et al. [14] the state of the gene, on or off, is taken into account but not the number of mRNAs, and therefore not the fluctuations generated by transcription. The activation parameter $k(p)$ is of course crucial in our case since autogenous regulation relies on the state of the promoter, which can be inactivated by proteins. Our model includes it. See also Fournier et al. [8] for some simulations of these stochastic models of autoregulation as well as some experiments.

1.3. Results of the Paper. The main goal of this paper is to estimate the possible 
benefit of the autogenous regulation to control the fluctuations of the number of 
copies of a given protein. The efficiency of a production process without feedback 
control is compared to a production process with an autoregulation of the gene 
expression, assuming that both of them produce the same average number of proteins.
The main characteristic used for the comparison is the standard deviation of the 
number of proteins at equilibrium. For this purpose, two approaches are used. 

Mathematical Analysis. One first studies the distribution of the number of 
proteins via a stochastic model. When there is no regulation, the corresponding 
classical mathematical model has been investigated in detail for some time now. In 
particular, the standard deviation of the number of proteins at equilibrium has a 
closed form expression in terms of the basic parameters of the production process. 
See for example the survey Paulsson [25], and also Fromion et al. [9].

To represent the negative feedback of the autogenous regulation, a simple model is used: each protein can bind, at some rate and for some random duration of time, to its own gene. In this situation the gene is inactive and transcription is not possible during that time. This amounts to saying that the gene is deactivated at a rate proportional to the number of proteins. The activation rate is constant.

As will be seen, the mathematical model of the autogenous regulation is more complicated; in particular there is no recurrence relationship between the moments of the number of proteins at equilibrium as in the classical model of the protein production process. For this reason, a limiting procedure is used: it amounts to assuming that the dynamics of the activation of the gene and of the evolution of mRNAs occur on a much faster time scale than the dynamics of the proteins. The values of the key parameters are presented in Section 5.1. The scaling parameter is the multiplicative factor describing the difference of speed between these two time scales. The main convergence result is Theorem 2. The assumption of a fast time scale for gene activation and mRNAs is quite common in the literature, see Bokes et al. [5] and Yvinec et al. [40]. The techniques used in these references rely on singular perturbation methods to deal with the two time scales. In our setting, a probabilistic approach is used; as will be seen, it gives precise results on the asymptotic stochastic evolution of the number of proteins.

Under this limiting regime it is shown that, asymptotically, the protein production process can be described as a birth and death process. See Keilson [17] for example. In state $x \in \mathbb{N}$, the birth rate is given by $a/(b + x)$ for some constants $a$ and $b$. It is a contribution of the paper to show that, with a simple model of the autoregulation, the repression mechanism indeed follows a Hill repression scheme with a hyperbolic control, i.e. with Hill coefficient 1. The death rate is not changed by the limiting procedure; it is proportional to $x$. Consequently, one can get an asymptotic closed form expression of the standard deviation of the number of proteins by using the explicit representation of the equilibrium of this birth and death process. See Corollary 4.1. It is shown that, in this limiting regime, the standard deviation is reduced by about 30%. The corresponding results are presented in Section 3 and Section 4 and in Appendix A. The mathematical results are obtained via convergence theorems for sequences of Markov processes, the proof of a stochastic averaging principle and a saddle point approximation result.

Simulations. We also analyze, via simulations, autogenous regulation as well as other aspects related to the regulation of protein production. This is presented in Section 5. Simulations are used mainly because of the complexity of the mathematical models of some aspects of the autogenous regulation. With plausible biological parameters, an improvement of 15% of the standard deviation of the number of proteins can be expected. This is significantly less than the performance of the limiting mathematical model studied in Section 4. The main reason seems to be that the scaling parameter is not, in some cases, sufficiently large for the limit given by the convergence result of Theorem 2 to be reasonably accurate.

Via simulations, one also investigates the case when the regulation is not on the 
gene expression but on the corresponding mRNAs: a protein can block an mRNA 
for some time. In this situation, it could be expected that the production process 
is modulated more smoothly by playing on the inactivation of a fraction of the 
mRNAs and not on the rough on-off control of the gene expression. It is shown 
that the improvement is real but not that big (less than 10%). It is nevertheless 
remarkable that if the average life time of mRNAs is significantly increased, our 
experiments show that the benefit of such regulation can be of the order of more 
than 30% on the standard deviation of the number of proteins. 

Coming back to regulation on the gene expression, our experiments show that, although the impact of autogenous regulation on the fluctuations of the number of proteins can be limited, it nevertheless has a very interesting property. Starting with a number of proteins significantly less (or greater) than the average number of proteins at equilibrium, the autogenous regulation returns to the “correct” number of proteins much faster than the classical production process without regulation. This is a clear advantage of this mechanism, which allows the cell to adapt quickly when biological conditions change, due to an external stress for example. See Section 5.6. This phenomenon has been observed, via experiments, in Rosenfeld et al. [31]. See also Camas et al. [5]. Finally, Section 5.5 investigates the comparison of production processes with and without a feedback on the gene expression through the estimation of their respective power spectral density.

Figure 1. Classical Three Stage Model for Protein Production.

2. Stochastic Models of Protein Production

We present the stochastic models used to investigate the protein production process. We will use the three step model describing the activation-deactivation of the gene, the transcription phase and the translation phase. As in most of the literature, it is assumed that the various events occurring within the cell, like the encounter of two macromolecules, have a duration with an exponential distribution. We start with the classical model used in this domain since the late 70’s by Berg [3] and Rigney [27, 28]. See also Thattai and van Oudenaarden and Paulsson [25].

2.1. The Classical Model of Protein Production.

(1) The gene is activated at rate $\lambda_1^+$ and inactivated at rate $\lambda_1^-$.

(2) If the gene is active, an mRNA is produced at rate $\lambda_2$. An mRNA is degraded at rate $\mu_2$.

(3) Given $M$ mRNAs at some moment, a protein is produced at rate $\lambda_3 M$. Each protein is degraded at rate $\mu_3$.

The stochastic processes describing the protein production process are: $I(t)$, the state of the gene at time $t$, which is 0 if it is inactive and 1 otherwise; $M(t)$, the number of mRNAs at time $t$; and $P(t)$, the number of proteins at that moment. The process $(I(t), M(t), P(t))$ is Markovian with state space
\[
\mathcal{S} \stackrel{\text{def.}}{=} \{0,1\} \times \mathbb{N}^2,
\]
and its transition rates are given by, if $(I(t), M(t), P(t)) = (i,m,p) \in \mathcal{S}$,
\[
\left\{
\begin{array}{ll}
(0,m,p) \to (1,m,p) \text{ at rate } \lambda_1^+, & (1,m,p) \to (0,m,p) \text{ at rate } \lambda_1^-,\\
(i,m,p) \to (i,m+1,p) \text{ at rate } \lambda_2 i, & (i,m,p) \to (i,m-1,p) \text{ at rate } \mu_2 m,\\
(i,m,p) \to (i,m,p+1) \text{ at rate } \lambda_3 m, & (i,m,p) \to (i,m,p-1) \text{ at rate } \mu_3 p.
\end{array}
\right.
\]
See Figure 1. This Markov process has a unique invariant distribution. An explicit expression of the distribution of $P$ at equilibrium is not known but, due to the linear transition rates, the moments of $P$ can be calculated recursively. In the following, $(I, M, P)$ will denote random variables whose law is invariant for $(I(t), M(t), P(t))$.


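Proposition 1. At equilibrium, the first two moments of $P$ can be expressed by
\[
\mathbb{E}(P) = \frac{\lambda_1^+}{\lambda_1^+ + \lambda_1^-}\,\frac{\lambda_2}{\mu_2}\,\frac{\lambda_3}{\mu_3}, \tag{1}
\]
\[
\mathrm{var}(P) = \mathbb{E}(P)\left(1 + \frac{\lambda_3}{\mu_2 + \mu_3}
+ \frac{\lambda_1^- \lambda_2 \lambda_3 \,(\lambda_1^+ + \lambda_1^- + \mu_2 + \mu_3)}
{(\lambda_1^+ + \lambda_1^-)(\mu_2 + \mu_3)(\lambda_1^+ + \lambda_1^- + \mu_2)(\lambda_1^+ + \lambda_1^- + \mu_3)}\right). \tag{2}
\]

See Paulsson [25], Shahrezaei and Swain, Swain et al. [35] and Fromion et al. [9] for example.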
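As an illustration, Relations (1) and (2) can be evaluated numerically. The following Python sketch is not part of the original analysis; the function name and argument names are hypothetical labels chosen here for the six rates $\lambda_1^+$, $\lambda_1^-$, $\lambda_2$, $\mu_2$, $\lambda_3$, $\mu_3$.

```python
def classical_protein_moments(lam1_plus, lam1_minus, lam2, mu2, lam3, mu3):
    """Mean and variance of P at equilibrium, following Relations (1) and (2).

    Argument names are illustrative; they stand for the activation rate,
    the inactivation rate, and the production/degradation rates of mRNAs
    and proteins of the classical three stage model.
    """
    big_lambda = lam1_plus + lam1_minus
    # Relation (1): fraction of time the gene is active times rho_2 * rho_3.
    mean = (lam1_plus / big_lambda) * (lam2 / mu2) * (lam3 / mu3)
    # Relation (2): contribution of mRNA fluctuations ...
    translation_term = lam3 / (mu2 + mu3)
    # ... and contribution of the on/off switching of the gene.
    promoter_term = (lam1_minus * lam2 * lam3 * (big_lambda + mu2 + mu3)) / (
        big_lambda * (mu2 + mu3) * (big_lambda + mu2) * (big_lambda + mu3)
    )
    variance = mean * (1.0 + translation_term + promoter_term)
    return mean, variance


if __name__ == "__main__":
    # Gene always active (lam1_minus = 0): the promoter term vanishes.
    print(classical_protein_moments(1.0, 0.0, 2.0, 1.0, 3.0, 1.0))
```

When the inactivation rate is zero, the last term of Relation (2) disappears and only the mRNA contribution remains, which gives a quick sanity check of the formula.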

2.2. A Stochastic Model of Protein Production with Autogenous Regulation. The regulation is done via proteins which can inactivate the gene corresponding to the protein. If there are $P$ proteins at some moment then the gene is inactivated at a rate proportional to $P$. Compared to the above model, only the first step changes.

(1) The inactive gene is activated at rate $\lambda_1^+$; the active gene is inactivated at rate $\lambda_1^- P$.

See Figure 2. For the sake of simplicity, we use the same notations $\lambda_1^+$ and $\lambda_1^-$ as for the classical model of protein production rather than introducing distinct symbols. It should be noted that in our comparisons in Section 5 these quantities are not necessarily the same for these two models.

The corresponding Markov process is denoted as $(I_F(t), M_F(t), P_F(t))$; its transitions have the same rates as those of $(I(t), M(t), P(t))$ except for those concerning the first coordinate:
\[
(0,m,p) \to (1,m,p) \text{ at rate } \lambda_1^+, \qquad (1,m,p) \to (0,m,p) \text{ at rate } \lambda_1^- p.
\]

As before, $(I_F, M_F, P_F)$ will denote random variables whose law is the invariant distribution of the Markov process $(I_F(t), M_F(t), P_F(t))$. The following proposition is the analogue of Proposition 1 for the feedback model but with unknown quantities related to the activity of the gene, $\mathbb{E}(I_F)$, and the correlation of the activity of the gene and the number of mRNAs, $\mathbb{E}(I_F M_F)$.

Proposition 2. At equilibrium, the first moment of $P_F$ can be expressed by
\[
\mathbb{E}(P_F) = \mathbb{E}(I_F)\,\frac{\lambda_2 \lambda_3}{\mu_2 \mu_3}. \tag{3}
\]









Figure 2. Three Stage Model for Protein Production with Autogenous Regulation


Proof. By equality of input and output for $(M_F(t))$ and $(P_F(t))$ at equilibrium, one gets the relations
\[
\lambda_2\, \mathbb{E}(I_F) = \mu_2\, \mathbb{E}(M_F), \qquad \lambda_3\, \mathbb{E}(M_F) = \mu_3\, \mathbb{E}(P_F),
\]
and therefore Relation (3). □

It does not seem that an expression for $\mathbb{E}(I_F)$ can be obtained: the relation $\lambda_1^-\, \mathbb{E}(I_F P_F) = \lambda_1^+ (1 - \mathbb{E}(I_F))$ of equality of flows for the activation/deactivation process introduces the correlation between $I_F$ and $P_F$. This is in fact the main obstacle to getting more insight into the fluctuations of the number of proteins. The next section investigates a scaling where the activation/deactivation phase is much more rapid than the production process of proteins.
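Since no closed form is available for the feedback model, its equilibrium can be explored by direct simulation of the Markov process $(I_F(t), M_F(t), P_F(t))$. The following Python sketch uses the standard Gillespie algorithm, which is an assumption made here for illustration and not necessarily the simulation method used by the authors. The function name, the argument names and the numerical values in the usage example are hypothetical, and all rates are assumed to be positive.

```python
import random


def simulate_feedback(lam1_plus, lam1_minus, lam2, mu2, lam3, mu3,
                      t_end, state=(1, 0, 0), seed=0):
    """Simulate the feedback model up to time t_end (all rates assumed > 0).

    Returns the time-averaged mean and variance of the number of proteins.
    """
    rng = random.Random(seed)
    i, m, p = state
    t = 0.0
    sum_p = 0.0   # integral of P_F(u) du over [0, t_end]
    sum_p2 = 0.0  # integral of P_F(u)^2 du over [0, t_end]
    while True:
        rates = [
            lam1_plus * (1 - i),  # (0,m,p) -> (1,m,p): activation
            lam1_minus * p * i,   # (1,m,p) -> (0,m,p): repression by a protein
            lam2 * i,             # (i,m,p) -> (i,m+1,p): transcription
            mu2 * m,              # (i,m,p) -> (i,m-1,p): mRNA degradation
            lam3 * m,             # (i,m,p) -> (i,m,p+1): translation
            mu3 * p,              # (i,m,p) -> (i,m,p-1): protein degradation
        ]
        total = sum(rates)  # positive: activation if i == 0, transcription if i == 1
        dt = rng.expovariate(total)
        if t + dt >= t_end:
            # The current state is held until the end of the time window.
            sum_p += p * (t_end - t)
            sum_p2 += p * p * (t_end - t)
            break
        sum_p += p * dt
        sum_p2 += p * p * dt
        t += dt
        event = rng.choices(range(6), weights=rates)[0]
        if event == 0:
            i = 1
        elif event == 1:
            i = 0
        elif event == 2:
            m += 1
        elif event == 3:
            m -= 1
        elif event == 4:
            p += 1
        else:
            p -= 1
    mean = sum_p / t_end
    variance = max(0.0, sum_p2 / t_end - mean * mean)
    return mean, variance


if __name__ == "__main__":
    # Illustrative rates only, not the biological values of Section 5.1.
    print(simulate_feedback(1.0, 0.05, 2.0, 1.0, 1.0, 0.1, t_end=2000.0))
```

The estimates include the transient from the initial state, so a long time window (or discarding an initial burn-in period) is needed before comparing them with equilibrium quantities such as Relation (3).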

3. A Scaling Analysis 

It has been seen in the previous section that, for the feedback mechanism, an explicit representation of the variance of the number of proteins at equilibrium seems to be difficult to derive. In this section we use the fact that the time scale of the first two steps, activation/deactivation of the gene and production of mRNAs, is more rapid than the time scale of protein production. This is illustrated by the fact that the lifetime of an mRNA is of the order of 2 min, whereas the doubling time of a bacterium is around 40 min, giving a lifetime of a protein of the order of one hour. See Taniguchi et al. [36], Li and Elf [19] and Hammar et al. [13]. As will be seen, this assumption simplifies the analysis of the feedback mechanism. We will be able to get an asymptotic explicit expression for the distribution of the number of proteins at equilibrium.

A (large) scaling parameter $N$ is used to stress the difference of time scale. When there is a feedback control, an upper index $N$ is added to the variables so that the corresponding Markov process is denoted as $(X_F^N(t)) = (I_F^N(t), M_F^N(t), P_F^N(t))$ on the state space $\mathcal{S} = \{0,1\} \times \mathbb{N}^2$. The transition rates of the Markov process are given by
\[
\left\{
\begin{array}{ll}
(0,m,p) \to (1,m,p) \text{ at rate } \lambda_1^+ N, & (1,m,p) \to (0,m,p) \text{ at rate } \lambda_1^- N p,\\
(i,m,p) \to (i,m+1,p) \text{ at rate } i \lambda_2 N, & (i,m,p) \to (i,m-1,p) \text{ at rate } \mu_2 m N,\\
(i,m,p) \to (i,m,p+1) \text{ at rate } \lambda_3 m, & (i,m,p) \to (i,m,p-1) \text{ at rate } \mu_3 p.
\end{array}
\right. \tag{4}
\]
The initial state does not depend on $N$ and is given by $X_F^N(0) = (i_0, m_0, p_0) \in \mathcal{S}$.

The aim of this section is to prove that the non-Markovian process $(P_F^N(t))$ converges in distribution to a limiting Markov process $(P_F(t))$. As will be seen, an averaging principle, proved in the appendix, holds: locally the “fast” process $(I_F^N(t), M_F^N(t))$ reaches very quickly some equilibrium depending on the current value of the “slow” variable $P_F^N(t)$. It turns out that the equilibrium of this limiting process $(P_F(t))$ can be analyzed in detail. The proof of the averaging principle relies on stochastic calculus applied to Markov processes in the same spirit as in Papanicolaou et al. [24] in a Brownian setting, see also Kurtz [18].


Notations. Throughout the rest of this paper, we will use the following notations: $\rho_1 = \lambda_1^+/\lambda_1^-$ and, for $i = 2, 3$, $\rho_i = \lambda_i/\mu_i$.


3.1. Scaling of the Classical Model of Protein Production. One first states a scaling result for the classical model of protein production. Since this result is much simpler to prove than the corresponding result for the feedback process, Theorem 2, its proof is skipped. One denotes by $(X^N(t)) = (I^N(t), M^N(t), P^N(t))$ the corresponding Markov process; its transition rates are the same as for the feedback process in Relation (4), except for deactivation:
\[
(1,m,p) \to (0,m,p) \text{ at rate } \lambda_1^- N.
\]
The following result shows that, in the limit, the evolution of the number of proteins converges to the time evolution of an $M/M/\infty$ queue. See Chapter 6 of Robert [29] for example.


Theorem 1. If $X^N(0) = (i_0, m_0, p_0) \in \mathcal{S}$, the sequence of processes $(P^N(t))$ converges in distribution to a birth and death process $(P(t))$ on $\mathbb{N}$ whose respective birth and death rates $(\beta_x)$ and $(\delta_x)$ are given by
\[
\beta_x = \lambda_3 \rho_2 \frac{\rho_1}{\rho_1 + 1} \quad\text{and}\quad \delta_x = \mu_3 x.
\]
The equilibrium distribution of $(P(t))$ is a Poisson distribution with parameter $\rho_1 \rho_2 \rho_3/(1 + \rho_1)$.


Proof. The intuition of this result can be described quickly as follows. The processes $(I^N(t), M^N(t))$ live on a much faster time scale than $(P^N(t))$ and therefore quickly reach equilibrium. When $N$ gets large, the process $(M^N(t))$ is an $M/M/\infty$ queue with arrival rate $\lambda_2 \lambda_1^+/(\lambda_1^+ + \lambda_1^-)$ and service rate $\mu_2$. See Chapter 6 of Robert [29] for example. Its equilibrium distribution is therefore Poisson with parameter $\rho_1 \rho_2/(1 + \rho_1)$. The process $(P^N(t))$ can then be seen as an $M/M/\infty$ queue with arrival rate $\lambda_3 \rho_2 \rho_1/(1 + \rho_1)$ and service rate $\mu_3$, i.e. a birth and death process with the transition rates of the theorem. Its equilibrium is Poisson with parameter $\rho_1 \rho_2 \rho_3/(1 + \rho_1)$.

The proof of a corresponding result in a more complicated setting, for the production process with feedback, is done below. For this reason the proof of this result is skipped. □


3.2. Scaling of the Production Process with Feedback. The following theorem is the main result of this section. As in the case of the classical model of protein production, it relies on the fact that, due to the scaling, the activation/deactivation of the gene and the production of mRNAs occur on a fast time scale so that an averaging principle holds. See below. Some of the technical results used to establish the following theorem are presented in the Appendix.

Figure 3. Feedback Model with Scaling Parameter N

Theorem 2 (Hill Repression Scheme). If $X_F^N(0) = (i_0, m_0, p_0) \in \mathcal{S}$, the sequence of processes $(P_F^N(t))$ converges in distribution to a birth and death process $(P_F(t))$ on $\mathbb{N}$ whose respective birth and death rates $(\beta_x)$ and $(\delta_x)$ are given by
\[
\beta_x = \lambda_3 \rho_2 \frac{\rho_1}{\rho_1 + x} \quad\text{and}\quad \delta_x = \mu_3 x,
\]
with $\rho_1 = \lambda_1^+/\lambda_1^-$ and $\rho_2 = \lambda_2/\mu_2$.

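Proof. If $f$ is a function on $\mathbb{N}$ with finite support then the process
\[
f(P_F^N(t)) - f(p_0) - \int_0^t \lambda_3 M_F^N(u)\, \Delta^+(f)(P_F^N(u))\, du - \int_0^t \mu_3 P_F^N(u)\, \Delta^-(f)(P_F^N(u))\, du
\]
is a local martingale. See Rogers and Williams [30] for example. The operators $\Delta^+$ and $\Delta^-$ are defined as follows, for a real-valued function $f$ on $\mathbb{N}$,
\[
\Delta^+(f)(x) = f(x+1) - f(x) \quad\text{and}\quad \Delta^-(f)(x) = f(x-1) - f(x), \qquad x \in \mathbb{N}.
\]
With a similar method as in the proof of Assertion 1) of the lemma in the appendix and by using the criterion of the modulus of continuity, see Theorem 7.2 page 81 of Billingsley [4], it is easy to show that the two processes
\[
\left(\int_0^t \lambda_3 M_F^N(u)\, \Delta^+(f)(P_F^N(u))\, du\right) \quad\text{and}\quad \left(\int_0^t \mu_3 P_F^N(u)\, \Delta^-(f)(P_F^N(u))\, du\right)
\]
are tight. Because of the tightness of $(P_F^N(t))$, established in the proposition of the appendix, one can take a subsequence $(N_k)$ such that the process
\[
\left(P_F^{N_k}(t),\ \int_0^t \lambda_3 M_F^{N_k}(u)\, \Delta^+(f)(P_F^{N_k}(u))\, du,\ \int_0^t \mu_3 P_F^{N_k}(u)\, \Delta^-(f)(P_F^{N_k}(u))\, du\right)
\]
converges in distribution.

Let $(P_F(t))$ be a possible limit of $(P_F^{N_k}(t))$. By continuity of the mapping
\[
(z(t)) \mapsto \left(\int_0^t z(u)\, \Delta^-(f)(z(u))\, du\right)
\]
on $D([0,T])$ endowed with the Skorohod topology, one gets, for the convergence in distribution,
\[
\lim_{k\to+\infty} \left(P_F^{N_k}(t),\ \int_0^t P_F^{N_k}(u)\, \Delta^-(f)(P_F^{N_k}(u))\, du\right)
= \left(P_F(t),\ \int_0^t P_F(u)\, \Delta^-(f)(P_F(u))\, du\right).
\]

For $t \leq T$, by using the definition of $\Lambda^N$ and of $\mathcal{S}_T$ in Section A.2 of the Appendix, one has the relation
\[
\int_0^t M_F^{N_k}(u)\, \Delta^+(f)(P_F^{N_k}(u))\, du = \int_{\mathcal{S}_T} m\, \Delta^+(f)(p)\, 1_{[0,t]}(u)\, \Lambda^{N_k}(dz),
\]
hence, by the proposition of the Appendix, for the convergence in distribution,
\begin{align*}
\lim_{k\to+\infty} \int_{\mathcal{S}_T} m\, \Delta^+(f)(p)\, 1_{[0,t]}(u)\, \Lambda^{N_k}(dz)
&= \int_0^t \sum_{p \in \mathbb{N}}\ \sum_{(i,m) \in \{0,1\}\times\mathbb{N}} m\, \Delta^+(f)(p)\, \nu_u(i,m,p)\, du\\
&= \int_0^t \sum_{p \in \mathbb{N}} \Delta^+(f)(p)\, \frac{\lambda_1^+}{\lambda_1^+ + \lambda_1^- p}\, \frac{\lambda_2}{\mu_2}\, \nu_u(\{0,1\}\times\mathbb{N}\times\{p\})\, du
\end{align*}
by the corresponding relation of the proposition of the Appendix. By convergence of the sequence $(\Lambda^{N_k})$ this last expression can be expressed as
\[
\lim_{k\to+\infty} \int_0^t \frac{\lambda_1^+}{\lambda_1^+ + \lambda_1^- P_F^{N_k}(u)}\, \frac{\lambda_2}{\mu_2}\, \Delta^+(f)(P_F^{N_k}(u))\, du
\stackrel{\text{dist.}}{=} \int_0^t \frac{\lambda_1^+}{\lambda_1^+ + \lambda_1^- P_F(u)}\, \frac{\lambda_2}{\mu_2}\, \Delta^+(f)(P_F(u))\, du
\]
for the convergence in distribution.

For $0 \leq s \leq t$, the characterization of a Markov process as the solution of a martingale problem gives the relation
\[
\mathbb{E}\left( f(P_F^N(t)) - f(P_F^N(s)) - \int_s^t \lambda_3 M_F^N(u)\, \Delta^+(f)(P_F^N(u))\, du - \int_s^t \mu_3 P_F^N(u)\, \Delta^-(f)(P_F^N(u))\, du \,\middle|\, \mathcal{F}_s \right) = 0,
\]
from which we deduce the identity
\[
\mathbb{E}\left( f(P_F(t)) - f(P_F(s)) - \int_s^t \lambda_3 \frac{\rho_1 \rho_2}{\rho_1 + P_F(u)}\, \Delta^+(f)(P_F(u))\, du - \int_s^t \mu_3 P_F(u)\, \Delta^-(f)(P_F(u))\, du \,\middle|\, \mathcal{F}_s \right) = 0.
\]
See Theorem II.2.42 of Jacod and Shiryaev [15]. Consequently, a possible limit is the solution of the martingale problem associated to the birth and death process with birth rate $(\beta_x)$ and death rate $(\delta_x)$ and with initial state $p_0$. One gets therefore the desired convergence in distribution of $(P_F^N(t))$. The theorem is proved. □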

There exist cases where the autoregulation is not achieved by the regulated protein itself but by a complex of this protein, e.g. by a dimer (2 copies of the protein) or a tetramer (4 copies), to cite a few examples. In order to handle such cases, it is necessary to add to the gene expression model a preliminary step describing the reaction scheme of the complex formation, based on the law of mass action. In general, the dynamics involved in the reaction scheme are (very) rapid compared to the other processes of the gene expression and lead, by a singular perturbation like argument, to represent, in the case of a deterministic model, the rate of production of mRNAs as a non-linear function of the protein concentration. Furthermore, when the reaction scheme possesses suitable properties, a Hill like repression function can also be obtained. See Weiss [31] for details. In the stochastic context, this leads to introducing a suitable scaling factor in the dynamics of the complex formation and to extending the derivation of the previous theorem to Hill functions, $x \mapsto a/(b + x^n)$, with order $n$ greater than 1.

The next section analyzes, in this limiting regime, the fluctuations of the number 
of proteins at equilibrium. 


4. Fluctuations of the Number of Proteins 


This section is devoted to the analysis of the equilibrium of the asymptotic process $(P_F(t))$ of Theorem 2 describing the evolution of the number of proteins with feedback. We start with a classical result for birth and death processes.


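Proposition 3. The invariant distribution $\pi_F$ of the birth and death process $(P_F(t))$ of Theorem 2 is given by
\[
\pi_F(x) = \frac{1}{Z}\, \frac{(\rho_2 \rho_3)^x}{x!} \prod_{i=0}^{x-1} \frac{\rho_1}{\rho_1 + i}, \qquad x \in \mathbb{N},
\]
where $\rho_1 = \lambda_1^+/\lambda_1^-$, $\rho_i = \lambda_i/\mu_i$ for $i = 2, 3$ and $Z$ is the normalization constant.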
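The invariant distribution of Proposition 3 can be evaluated numerically from the ratio of successive terms, $\pi_F(x+1)/\pi_F(x) = \rho_1\rho_2\rho_3/((x+1)(\rho_1+x))$, which is the detailed balance relation of the birth and death process. The Python sketch below is an illustration under assumptions made here: the function name and the truncation tolerance `tol` are hypothetical, and the infinite support is cut where the weights become negligible.

```python
import math


def feedback_invariant_distribution(rho1, rho2, rho3, tol=1e-16):
    """Return [pi_F(0), pi_F(1), ...] truncated where weights fall below tol.

    Weights are accumulated in log scale so that large values of
    rho1 * rho2 * rho3 do not overflow.
    """
    rho = rho1 * rho2 * rho3
    log_w = [0.0]  # log of the unnormalized weight of state 0
    peak = 0.0     # largest log-weight seen so far
    x = 0
    while True:
        # log of pi_F(x+1) / pi_F(x)
        step = math.log(rho / ((x + 1) * (rho1 + x)))
        log_w.append(log_w[-1] + step)
        x += 1
        peak = max(peak, log_w[-1])
        # Stop once past the mode and far enough in the tail.
        if step < 0.0 and log_w[-1] < peak + math.log(tol):
            break
    weights = [math.exp(v - peak) for v in log_w]
    z = sum(weights)  # plays the role of the normalization constant Z
    return [w / z for w in weights]


if __name__ == "__main__":
    pi = feedback_invariant_distribution(2.0, 3.0, 5.0)
    mean = sum(k * q for k, q in enumerate(pi))
    var = sum(k * k * q for k, q in enumerate(pi)) - mean ** 2
    print(mean, var)
```

This gives the numerical evaluation of the mean and variance mentioned in the text, without going through the hypergeometric representation of $Z$.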


The expression of $\pi_F$ is explicit but with a normalization constant which is not simple. The constant $Z$ can be expressed in terms of hypergeometric functions. See Abramowitz and Stegun [1] for example. Even if we can get a numerical evaluation of the mean and of the variance of $\pi_F$, it is much more awkward to get some insight into the dependence of these quantities on some of the parameters, like $\rho_2$ or $\rho_3$ for example. In the following we give an asymptotic description of the ratio of the variance to the mean of the number of proteins at equilibrium when the value of the quantity $\rho_1 \rho_2 \rho_3$ is large. In a biological context the numerical value of this parameter is not always large but this limit result sheds some light on the qualitative behavior of the autoregulation mechanism. See Corollary 4.1 for example. A Laplace method is in particular used to investigate the asymptotic behavior of the first two moments of $\pi_F$.

Theorem 1 shows that the distribution of the process $(P(t))$ at equilibrium is Poisson with parameter $\mathbb{E}(P(t)) = x_P = \rho_1 \rho_2 \rho_3/(1 + \rho_1)$. In particular, one has the relation $\mathrm{var}(P(t)) = \mathbb{E}(P(t))$. In the rest of this section, we will be interested in the corresponding quantity for the feedback process.

For $\eta > -1$ and $\rho > 0$, denote by $\nu_{\eta,\rho}$ the probability distribution on $\mathbb{N}$ defined by
\[
\nu_{\eta,\rho}(k) = \frac{1}{Z_{\eta,\rho}} \prod_{i=1}^{k} \frac{\rho}{i(\eta + i)}, \qquad k \in \mathbb{N}, \tag{5}
\]
where $Z_{\eta,\rho}$ is the normalization constant. It is easily seen that $\pi_F$ is $\nu_{\eta,\rho}$ with $\rho = \rho_1 \rho_2 \rho_3$ and $\eta = \rho_1 - 1$.

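Proposition 4. If, for $\rho > 0$ and $\eta > -1$, $A_\rho$ is a random variable with distribution $\nu_{\eta,\rho}$ defined by Equation (5), then for the convergence in distribution
\[
\lim_{\rho\to+\infty} \frac{A_\rho - a_\rho}{\sqrt{a_\rho}} = \mathcal{N}\left(0, 1/\sqrt{2}\right),
\]
where $a_\rho = \left(\sqrt{\eta^2 + 4\rho} - \eta\right)/2$ and $\mathcal{N}(0, 1/\sqrt{2})$ is a centered Gaussian random variable with variance $1/2$. In particular, for the convergence in distribution,
\[
\lim_{\rho\to+\infty} \frac{A_\rho}{\sqrt{\rho}} = 1.
\]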

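Proof. If $\phi$ is a bounded function on $\mathbb{R}$, denote
\[
\Delta_\rho(\phi) = \sum_{k=0}^{+\infty} \phi\left(\frac{k - \lceil a_\rho \rceil}{\sqrt{a_\rho}}\right) \exp\left(\sum_{i=\lceil a_\rho \rceil}^{k} \log\left(\frac{\rho}{i(\eta + i)}\right)\right);
\]
the definition of $\nu_{\eta,\rho}$ gives that
\[
\mathbb{E}\left(\phi\left(\frac{A_\rho - \lceil a_\rho \rceil}{\sqrt{a_\rho}}\right)\right) = \frac{\Delta_\rho(\phi)}{\Delta_\rho(1)}. \tag{6}
\]
Fix $\phi$ some continuous function with compact support on $[-K_0, K_0]$ for some $K_0 > 0$. Since $a_\rho$ is the solution of the equation $a_\rho(\eta + a_\rho) = \rho$, a change of variable gives the relation
\[
\Delta_\rho(\phi) = \sum_{k=-\lceil a_\rho \rceil}^{+\infty} \phi\left(\frac{k}{\sqrt{a_\rho}}\right) \exp\left(\sum_{i=0}^{k} \log\left(\frac{a_\rho(\eta + a_\rho)}{(i + \lceil a_\rho \rceil)(\eta + \lceil a_\rho \rceil + i)}\right)\right).
\]
The uniform estimation
\[
\sum_{i=0}^{k} \log\left(\frac{a_\rho(\eta + a_\rho)}{(i + \lceil a_\rho \rceil)(\eta + \lceil a_\rho \rceil + i)}\right)
= \int_0^k \log\left(\frac{a_\rho(\eta + a_\rho)}{(u + \lceil a_\rho \rceil)(\eta + \lceil a_\rho \rceil + u)}\right) du + O\left(\frac{1}{\sqrt{a_\rho}}\right)
\]
for all $k \in \mathbb{Z}$ with $|k| \leq K_0 \sqrt{a_\rho}$ and the fact that $\phi$ has a compact support give that the quantity $\Delta_\rho(\phi)$ is equivalent to
\begin{align*}
&\sum_{k=-\lfloor K_0\sqrt{a_\rho} \rfloor}^{\lfloor K_0\sqrt{a_\rho} \rfloor} \phi\left(\frac{k}{\sqrt{a_\rho}}\right) \exp\left(\int_0^k \log\left(\frac{a_\rho(\eta + a_\rho)}{(u + \lceil a_\rho \rceil)(\eta + \lceil a_\rho \rceil + u)}\right) du\right)\\
&\quad = \sum_{k=-\lfloor K_0\sqrt{a_\rho} \rfloor}^{\lfloor K_0\sqrt{a_\rho} \rfloor} \phi\left(\frac{k}{\sqrt{a_\rho}}\right) \exp\left(\int_0^{k/\sqrt{a_\rho}} \sqrt{a_\rho}\, \log\left(\frac{a_\rho(\eta + a_\rho)}{(u\sqrt{a_\rho} + \lceil a_\rho \rceil)(\eta + \lceil a_\rho \rceil + u\sqrt{a_\rho})}\right) du\right).
\end{align*}
Again, with the uniform estimation
\[
\sqrt{a_\rho}\, \log\left(\frac{a_\rho(\eta + a_\rho)}{(u\sqrt{a_\rho} + \lceil a_\rho \rceil)(\eta + \lceil a_\rho \rceil + u\sqrt{a_\rho})}\right) = -2u + O\left(\frac{1}{\sqrt{a_\rho}}\right)
\]
for $u$ in some fixed finite interval, one gets that
\[
\Delta_\rho(\phi) \sim \sum_{k=-\lfloor K_0\sqrt{a_\rho} \rfloor}^{\lfloor K_0\sqrt{a_\rho} \rfloor} \phi\left(\frac{k}{\sqrt{a_\rho}}\right) \exp\left(-2\int_0^{k/\sqrt{a_\rho}} u\, du\right) \sim \sqrt{a_\rho} \int_{-\infty}^{+\infty} \phi(v)\, e^{-v^2}\, dv.
\]
With similar estimations for $\Delta_\rho(1)$ (which imply in fact the tightness of the random variables $(A_\rho - \lceil a_\rho \rceil)/\sqrt{a_\rho}$) and Relation (6), the proposition is proved. □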

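Corollary 4.1 (Asymptotic Number of Proteins with Regulation). If $P_F$ is a
random variable with distribution $\pi_F$ then, for the convergence in distribution,

$$\lim_{\rho_2\rho_3\to+\infty} \frac{E(P_F)}{\sqrt{\rho_1\rho_2\rho_3}} = 1 \quad\text{and}\quad \lim_{\rho_2\rho_3\to+\infty} \frac{\mathrm{var}(P_F)}{E(P_F)} = \frac{1}{2}. \tag{7}$$

Furthermore, for the convergence in distribution,

$$\lim_{\rho_2\rho_3\to+\infty} \frac{P_F - a_\rho}{\sqrt{a_\rho}} = \mathcal{N}\left(0, 1/\sqrt{2}\right),$$

where $a_\rho = \left(\sqrt{(\rho_1 - 1)^2 + 4\rho_1\rho_2\rho_3} - \rho_1 + 1\right)/2$.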

The equivalent of Relation (7) for the scaling of the classical model of protein
production is

$$E(P) = \frac{\rho_1}{1+\rho_1}\,\rho_2\rho_3 \quad\text{and}\quad \lim_{\rho_2\rho_3\to+\infty} \frac{\mathrm{var}(P)}{E(P)} = 1.$$

by Theorem it shows that a feedback mechanism reduces the variance of the 
number of proteins in this limiting regime by a factor 2 for the ratio of the second 
moment and the first moment. 


5. Discussion 

In this section, other aspects of the regulation of protein production are discussed
via simulations in a plausible biological context whose parameters are defined below.
The simulations follow the models of Section 2.2 and simulate the variables
$I_F$, $M_F$, and $P_F$ themselves, not their scaling limits.
5.1. Numerical Values of Biological Parameters. For the model with feedback,
there are six parameters to determine. By using the literature one can estimate
the common orders of magnitude of these parameters in a biological context.
We therefore propose a set of parameters corresponding to an “ordinary” gene.

(1) Gene regulation. The parameter $\lambda_1^-$ gives the rate at which a given protein
reaches its own promoter. It has been shown that this motion combines
a three-dimensional diffusion in the cytoplasm and one-dimensional sliding
along the DNA, see Halford [12].

Experiments on the lac repressor, using live-cell single-molecule imaging
techniques, show that this time is of the order of 5 min, see Li and Elf [20]
and Hammar et al. For this reason we will take $\lambda_1^- = 3.3 \times 10^{-3}$ s$^{-1}$.

The parameter $\lambda_1^+$ can be quite variable, depending on the affinity of the
protein for the DNA sequence; we set $\lambda_1^+ = 1$ s$^{-1}$.

(2) mRNAs. The lifetime of an mRNA is $1/\mu_2 = 4$ min, see Taniguchi et al.
When the gene expression is always active (corresponding to the case where
our variable $I$ remains equal to 1), there is an average of $m = 2$ messengers,
that is to say $\lambda_2^{-1} = \mu_2^{-1}/m = 120$ s, which gives $\lambda_2 = 8.3 \times 10^{-3}$ s$^{-1}$.

(3) Proteins. A doubling time for the cell of $\tau_{1/2} = 40$ min gives a protein
decay time of around one hour. For this reason one takes $\mu_3 = \log 2/\tau_{1/2} =
2.8 \times 10^{-4}$ s$^{-1}$ for the rate of protein decay. It is assumed that a given type
of protein is produced in $p = 300$ copies when the gene expression
is always active. From one messenger, a protein should be produced in
a duration of time of the order of $\lambda_3^{-1} = m \times \mu_3^{-1}/p$, which gives $\lambda_3 =
4 \times 10^{-2}$ s$^{-1}$.
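The arithmetic behind these orders of magnitude can be retraced in a few lines. This is a sketch only: the variable names are illustrative, and the computed values agree with the figures quoted in the list up to the rounding used in the text (in particular, the decay time of "around one hour" is computed here from the 40 min doubling time).

```python
import math

MINUTE = 60.0  # seconds

# (1) Gene regulation: a protein reaches its promoter in about 5 min.
lambda1_minus = 1.0 / (5 * MINUTE)

# (2) mRNAs: lifetime of 4 min, and 2 messengers on average when the gene is always active.
mean_mrna_count = 2
mu2 = 1.0 / (4 * MINUTE)
lambda2 = mean_mrna_count * mu2  # mean count = lambda2 / mu2

# (3) Proteins: decay driven by a doubling time of 40 min, 300 copies on average.
mean_protein_count = 300
mu3 = math.log(2) / (40 * MINUTE)
lambda3 = mean_protein_count * mu3 / mean_mrna_count  # 1/lambda3 = m * (1/mu3) / p

for name, value in [("lambda1_minus", lambda1_minus), ("lambda2", lambda2),
                    ("mu2", mu2), ("lambda3", lambda3), ("mu3", mu3)]:
    print(f"{name} = {value:.2e} s^-1")
```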

These parameters may correspond to an “ordinary bacterial” gene: in a E. Coli 
genome of 4300 genes, there are around 3.6 x 10® proteins and 1.4 x 10® mRNAs 
per gene, see Table 1 of Chapter 3 of Neidhardt [23], the number of messengers and 
proteins is of the order of magnitude of our numerical estimation of the parameters. 

5.2. Impact of Autogenous Regulation on Gene Expression. We have compared
two mechanisms: the classical model without regulation and the autogenous
regulation process. The mean number of proteins is the same in both models, as well
as the mean number of mRNAs produced, $E(M) = E(M_F)$. Parameters $\lambda_1^+$ and $\lambda_1^-$ are adapted
in the classical model to fulfill these conditions. The other parameters are as defined
in the previous section.

The comparison is shown in Figure 4. The mean number of proteins is 178; as can
be seen, the curve for the autogenous regulation is slightly more concentrated
around the mean, but not by much. The values of the corresponding standard
deviations are not really different: $\sqrt{\mathrm{var}(P)} = 42.2$ and $\sqrt{\mathrm{var}(P_F)} = 35.8$. The impact
of the autogenous regulation on the variability of the number of proteins is
non-trivial but not really spectacular for the set of parameters associated to a “typical”
gene. This is significantly less than the performance of the limiting mathematical
model studied in Section 4. The main reason seems to be that the scaling parameter
is not, in some cases, sufficiently large to have a reasonable accuracy with the limit
given by the convergence theorem.
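A comparison of this kind can be sketched with a standard Gillespie (stochastic simulation) algorithm. The code below is not the simulation code of the paper; it is a minimal illustration, assuming the unscaled rates suggested by Section 5.1 and by the evolution equations of the appendix: activation of the gene at rate $\lambda_1^+$, inactivation at rate $\lambda_1^- P$, and production and decay of mRNAs and proteins at rates $\lambda_2 I$, $\mu_2 M$, $\lambda_3 M$, $\mu_3 P$. The `feedback=False` variant only removes the dependence on $P$ and does not retune $\lambda_1^\pm$ to match the means, as is done in the text. All function and variable names are hypothetical.

```python
import random


def pick_event(rates, rng):
    """Return the index of an event chosen with probability proportional to its rate."""
    threshold = rng.random() * sum(rates)
    chosen = None
    for index, rate in enumerate(rates):
        if rate <= 0.0:
            continue
        chosen = index
        threshold -= rate
        if threshold < 0.0:
            break
    return chosen


def simulate(params, t_end, feedback=True, seed=0):
    """Gillespie simulation of (I, M, P); returns time averages over [0, t_end]."""
    lam1_plus, lam1_minus, lam2, mu2, lam3, mu3 = params
    rng = random.Random(seed)
    i, m, p = 1, 0, 0  # gene active, no mRNA, no protein (arbitrary initial state)
    t = 0.0
    sum_p = sum_p2 = sum_i = 0.0
    while True:
        off_rate = lam1_minus * (p if feedback else 1) * i
        rates = [lam1_plus * (1 - i), off_rate, lam2 * i, mu2 * m, lam3 * m, mu3 * p]
        holding = rng.expovariate(sum(rates))
        last = t + holding >= t_end
        if last:
            holding = t_end - t
        sum_p += holding * p
        sum_p2 += holding * p * p
        sum_i += holding * i
        if last:
            break
        t += holding
        event = pick_event(rates, rng)
        if event == 0:
            i = 1
        elif event == 1:
            i = 0
        elif event == 2:
            m += 1
        elif event == 3:
            m -= 1
        elif event == 4:
            p += 1
        else:
            p -= 1
    mean_p = sum_p / t_end
    return {"mean": mean_p,
            "variance": sum_p2 / t_end - mean_p ** 2,
            "active_fraction": sum_i / t_end}


# Orders of magnitude of Section 5.1, in s^-1.
params = (1.0, 3.3e-3, 8.3e-3, 1.0 / 240.0, 4e-2, 2.8e-4)
with_feedback = simulate(params, t_end=20000.0, feedback=True, seed=1)
without_feedback = simulate(params, t_end=20000.0, feedback=False, seed=1)
print(with_feedback)
print(without_feedback)
```

The horizon used here is short compared with the protein decay time of about one hour, so the averages include the initial transient; estimating equilibrium standard deviations such as those quoted above would require a much longer run and a burn-in period.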

5.3. The Limiting Scaling Regime as a Lower Bound. Roughly speaking,
the convergence theorem and Corollary 4.1 give that, for $N$ and $\rho_2\rho_3$ large, the ratio
$\mathrm{var}(P_F^N)/E(P_F^N)$ is close to 1/2. In Figure 5, we consider a simulation with
Figure 4. Simulations: Protein distribution with and without
autogenous regulation with a fixed mean number of proteins of 178.

fixed product $\rho_2\rho_3 = 71.43$ with $N$ varying. The interesting feature is that the
ratio is decreasing with $N$; this suggests that the variance of the limit of the scaling
procedure should provide a lower bound for the variance of the real model. We
have not been able to prove this phenomenon rigorously. For $N = 250$, the value
of the ratio is $\mathrm{var}(P_F^N)/E(P_F^N) = 0.7964$, which is quite far from its limiting value 1/2
given by Corollary 4.1. This can be explained by the fact that the quantities $N$
and $\rho_2\rho_3$ are not very large.


5.4. Regulation of the Production Process on mRNAs. The regulation on
the gene has the effect of an ON/OFF mechanism. When the gene is active, it
produces mRNAs at full speed, and no mRNA is produced when it is inactive. This
suggests that the production of proteins follows roughly the same pattern: a steady
production rate at some instants and little production otherwise. This scheme
can consequently increase the variability of the production process of proteins. A
possible idea to reduce the variance due to the activation/inactivation of the gene
is to transfer the activation/inactivation process to the level of the mRNAs. This
possibility is investigated in this section. Each mRNA can be inactivated by a
protein at rate $\lambda_2^-$; in this state it cannot produce proteins. An inactivated mRNA
becomes active again at rate $\lambda_2^+$. In this way the production process can, hopefully,
be modulated more smoothly by playing on the inactivation of a fraction of the
mRNAs. At time $t$, if the number of active [resp. inactive] mRNAs
is $M(t)$ [resp. $M^*(t)$], the process is Markov with transition
Figure 5. Simulations: Evolution of the ratio $\mathrm{var}(P_F^N)/E(P_F^N)$ as
a function of $N$.


rates, for $(m, m^*, p) \in \mathbb{N}^3$,

$$\begin{cases}
(m, m^*, p) \to (m+1, m^*, p) & \text{at rate } \lambda_2,\\
(m, m^*, p) \to (m-1, m^*+1, p) & \text{at rate } \lambda_2^- m p,\\
(m, m^*, p) \to (m+1, m^*-1, p) & \text{at rate } \lambda_2^+ m^*,
\end{cases}$$

the other transitions are as before: active or inactive mRNAs die at rate $\mu_2$, and
proteins are produced at rate $\lambda_3 m$ and die at rate $\mu_3$.
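The transition structure above can be written down as a small function listing, for a given state, the reachable states and their rates. This is a sketch under the reading adopted here: the superscripts of the inactivation and reactivation rates are partly lost in the source, each mRNA (active or not) dies at rate $\mu_2$, and each protein dies at rate $\mu_3$. The function name and the numerical values are hypothetical.

```python
def mrna_regulation_transitions(state, lam2, lam2_minus, lam2_plus, mu2, lam3, mu3):
    """Return the list of (next_state, rate) pairs with positive rate from state (m, m*, p)."""
    m, m_star, p = state
    candidates = [
        ((m + 1, m_star, p), lam2),                    # production of an active mRNA
        ((m - 1, m_star + 1, p), lam2_minus * m * p),  # inactivation of an mRNA by a protein
        ((m + 1, m_star - 1, p), lam2_plus * m_star),  # reactivation of an inactive mRNA
        ((m - 1, m_star, p), mu2 * m),                 # death of an active mRNA
        ((m, m_star - 1, p), mu2 * m_star),            # death of an inactive mRNA
        ((m, m_star, p + 1), lam3 * m),                # translation by active mRNAs only
        ((m, m_star, p - 1), mu3 * p),                 # death of a protein
    ]
    return [(next_state, rate) for next_state, rate in candidates if rate > 0.0]


# Illustrative values only, not the parameters of the experiments.
transitions = dict(mrna_regulation_transitions(
    (3, 2, 10), lam2=1.0, lam2_minus=0.5, lam2_plus=2.0, mu2=0.25, lam3=4.0, mu3=0.1))
total_rate = sum(transitions.values())
print(transitions, total_rate)
```

Only active mRNAs contribute to the protein production rate, which is what allows the production to be modulated by the fraction of inactivated messengers.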

To compare the two regulation processes, either on the gene or on mRNAs,
simulations have been done with the following constraints: the average number of
proteins is fixed around 1400. To have a fair comparison, we add the constraint
that the number of mRNAs produced should be the same in all simulations. The
numerical values have been estimated by using methods similar to those of Section 5.1,
adapted to this setting. Experiment (3) considers the case of an average lifetime of an
mRNA of 40 min; although this is far from a “normal” biological setting, this scenario
has, as will be seen, the advantage of stressing the importance of this parameter in
this configuration.

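Numerical Values of Parameters. Each entry is the mean duration associated with the
parameter, i.e. the inverse of the corresponding rate.

(1) Regulation on the gene.

    $1/\lambda_1^+$   $1/\lambda_1^-$   $1/\lambda_2$   $1/\mu_2$   $1/\lambda_3$   $1/\mu_3$
    0.21 s            5 min             12 s            4 min       25 s            1 h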


(2) Regulation on mRNAs (I).

For this experiment, the expected lifetime of an mRNA is twice the
corresponding value of case (1).

    $1/\lambda_2$   $1/\lambda_2^+$   $1/\lambda_2^-$   $1/\mu_2$   $1/\lambda_3$   $1/\mu_3$
    23 s            2 s               45 min            8 min       25 s            1 h
(3) Regulation on mRNAs (II).

For this second experiment on the regulation of mRNAs, the expected
lifetime of an mRNA is 10 times that of case (1).

    $1/\lambda_2$   $1/\lambda_2^+$   $1/\lambda_2^-$   $1/\mu_2$   $1/\lambda_3$   $1/\mu_3$
    23.8 s          2 s               45 min            40 min      25 s            1 h


Results of the Experiments. Table 1 shows that the mean number of mRNAs
produced per unit of time is essentially the same in all experiments, as well as
the mean number of active mRNAs. It should be noted that, when the mean
lifetime of an mRNA is 8 min, the impact of the regulation on mRNAs on the standard
deviation of the number of proteins is not really significant (10% gain) compared
with the regulation on the gene. When the mean lifetime is 40 min the improvement
of the standard deviation, 36%, becomes significant, showing that in this case the
production process is “smoothed” by this mechanism. The three distributions of the
number of proteins of these experiments are presented in Figure 6.


    Regulation on                          Gene       mRNAs / 8 min   mRNAs / 40 min
    Mean Nb of mRNAs                       10.33      19.74           99.04
    Mean Nb of Active mRNAs                10.33      9.77            9.81
    Mean Nb of Proteins                    1403.63    1400.29         1403.36
    Standard Deviation of Nb of Proteins   92.66      84.22           59.04

Table 1. Comparison of Regulation Processes on Gene or on mRNAs
with Different Lifetimes
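The gains quoted in the text can be recomputed directly from the standard deviations of Table 1, taking the regulation on the gene as the reference. The short sketch below only restates this arithmetic; the variable names are illustrative.

```python
# Standard deviations of the number of proteins, from Table 1.
std_gene = 92.66
std_mrna_8min = 84.22
std_mrna_40min = 59.04


def relative_gain(reference, value):
    """Relative reduction of `value` with respect to `reference`."""
    return (reference - value) / reference


gain_8min = relative_gain(std_gene, std_mrna_8min)
gain_40min = relative_gain(std_gene, std_mrna_40min)
print(f"8 min lifetime: {gain_8min:.1%}, 40 min lifetime: {gain_40min:.1%}")
```

The two reductions come out at roughly 9% and 36%, consistent with the orders of magnitude (10% and 36%) given in the discussion of the experiments.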



Figure 6. Simulations: Probability Distribution of the Number
of Proteins with Regulation on Gene or on mRNAs; $1/\mu_2$ is the
average lifetime of an mRNA. The average number of proteins is
1400.
5.5. Impact of Feedback on Frequency. In this section, we study the nature of
the fluctuations of the number of proteins at equilibrium from the point of view of
signal processing or automatic control. The aim of a feedback is often to change
the nature of the signal, attenuating disturbances by reducing, for instance, high
frequencies. In these cases, spectral analysis gives a characterization of the nature
of the changes.

By analogy, we consider our model as a system that has to achieve a command
(the production of a given mean number of proteins) and where the resulting signal
$(P(t))$ or $(P_F(t))$ is altered by some noise. In this framework, one can study whether
the feedback has an impact on the signal by rejecting some frequency
ranges.

To do so, consider the signals $(P(t))$ and $(P_F(t))$ of two simulations, without and
with autogenous regulation. The analysis of these signals is done by estimating
the power spectral density, which describes the spectral characteristics of a stochastic
process. We estimate the power spectral density of each signal using the classical
smoothed periodogram estimator. See George et al. [10] and Chapter 10 of
Miller et al. [22] for example.

The result is shown in Figure 7. Both spectra seem to represent a low-pass
filter with a cut-off frequency of the order of magnitude of the dilution factor
$\mu_3 = 2.8 \times 10^{-4}$ s$^{-1}$. The two power spectral densities do not seem to exhibit
significant differences. The feedback has therefore no noticeable effect in terms of
reduction of frequency disturbances.
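One classical way to smooth a periodogram is to average the periodograms of consecutive segments of the signal (Bartlett's method). The text does not say which smoothing was used for Figure 7, so the sketch below is only an illustration of the idea, applied to a synthetic first-order autoregressive signal rather than to a simulated protein trajectory. The function names, the segment length and the coefficient 0.9 are assumptions.

```python
import cmath
import random


def periodogram(segment):
    """Raw periodogram of a real segment at frequencies k/n, k = 0..n//2."""
    n = len(segment)
    mean = sum(segment) / n
    centered = [x - mean for x in segment]
    powers = []
    for k in range(n // 2 + 1):
        coefficient = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                          for t, x in enumerate(centered))
        powers.append(abs(coefficient) ** 2 / n)
    return powers


def averaged_periodogram(signal, segment_length):
    """Average of the periodograms of consecutive, non-overlapping segments."""
    n_segments = len(signal) // segment_length
    averaged = [0.0] * (segment_length // 2 + 1)
    for s in range(n_segments):
        segment = signal[s * segment_length:(s + 1) * segment_length]
        for k, value in enumerate(periodogram(segment)):
            averaged[k] += value / n_segments
    return averaged


# Synthetic low-pass signal: x_t = 0.9 x_{t-1} + noise (illustrative only).
rng = random.Random(0)
signal, x = [], 0.0
for _ in range(1024):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    signal.append(x)

psd = averaged_periodogram(signal, segment_length=64)
print(psd[1], psd[-1])  # lowest non-zero frequency versus highest frequency
```

For such a signal the estimated density is much larger at low frequencies than at high frequencies, which is the low-pass shape described for both spectra. Comparing two such estimates, with and without feedback, is the kind of analysis reported in this section.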





Figure 7. Power spectral density estimation of signals with and 
without regulation 


5.6. Versatility of the Protein Production Process. This section is devoted
to the impact of autogenous regulation on another aspect of protein production. Up
to now, we have considered the production process of proteins at equilibrium, by
assuming that the production rate of a given protein has to be fixed. It may happen
nevertheless that, due to an external stress, such as antibiotics, DNA damage by
UV, see Camas et al. [6], or nutrient absorption, see Schleif [33], the cell has to
change its production rate rapidly, for example to quickly produce a large amount of
proteins. The affinity of the transcription factor for the promoter of the gene can
be adapted for that purpose. Conversely, when the external stress disappears, the
production of the protein has to be quickly reduced to minimize the consumption
of resources.

We consider the situation when the two production processes, with and without
autogenous regulation, give the same average output of proteins at equilibrium.
Two cases are investigated: when the initial number of proteins is below the
equilibrium value, see Figure 8, or above this value, see Figure 9. As can be seen, the
autogenous production process converges more rapidly to equilibrium in both cases.
Our simulations show that when the initial value is 290, the autogenous production
process is 40% faster than the process without feedback to reach the level 1300 (the
equilibrium is at 1400 in this case). A similar result holds in the other case.

These interesting properties are related to the modulation of the gene activity.
In the experiment of Figure 8, for the autogenous process, the rate of activity of
the gene is of the order of 50% at the beginning and it is only of the order of
0.1 later at equilibrium. Without regulation this rate is constant throughout the
simulation. This explains the “fast start” of the autogenous process. An analogous
explanation holds for the experiment of Figure 9: in the autogenous process, the
gene is rapidly switched off due to the large number of proteins, thereby rapidly
decreasing the number of proteins. This is consistent with experiments described in
Camas et al. [6] and especially Rosenfeld et al. [31], where the improvement has
been estimated at 80% in some cases.
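The "fast start" can be illustrated on a deterministic caricature of the two processes. This is not the stochastic simulation of the text: the sketch below assumes a mean-field equation $dp/dt = \text{production}(p) - \mu_3 p$, with a constant production without feedback and a hyperbolic production $c/(1 + p/K)$ with feedback, both tuned to the same equilibrium of 1400. The constant `K`, the decay time of one hour and the Euler step are illustrative assumptions.

```python
import math


def time_to_reach(target, p0, production, mu, dt=1.0, t_max=1.0e6):
    """Explicit Euler integration of dp/dt = production(p) - mu * p until p >= target."""
    p, t = p0, 0.0
    while p < target and t < t_max:
        p += dt * (production(p) - mu * p)
        t += dt
    return t


mu3 = 1.0 / 3600.0  # protein decay rate, for a decay time of about one hour (s^-1)
p_eq = 1400.0       # common equilibrium level
K = 290.0           # hypothetical half-repression level, chosen for illustration only

constant_rate = mu3 * p_eq                # production without feedback
c = mu3 * p_eq * (1.0 + p_eq / K)         # chosen so that c / (1 + p_eq / K) = mu3 * p_eq

t_without = time_to_reach(1300.0, 290.0, lambda p: constant_rate, mu3)
t_with = time_to_reach(1300.0, 290.0, lambda p: c / (1.0 + p / K), mu3)
t_closed_form = math.log((p_eq - 290.0) / (p_eq - 1300.0)) / mu3  # no-feedback case

print(t_without, t_closed_form, t_with)
```

Below the equilibrium the hyperbolic production exceeds the constant one, so the level 1300 is reached sooner with feedback. The size of the speed-up in this caricature depends on the assumed `K` and should not be read as a reproduction of the 40% reported above.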



Figure 8. Simulations: Evolution of the Mean Number of Proteins:
Initial Point at 290, equilibrium at 1400. Time scale: $\times 10^4$ sec.
Figure 9. Simulations: Evolution of the Mean Number of Proteins:
Initial Point at 1400, equilibrium at 290. Time scale: xlO^
Appendix A. Convergence Results


We first introduce some notations that will be used throughout this section. 

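A.1. Evolution Equations. We will use Skorohod's topology for convergence
in distribution in the space $\mathcal{D}([0,T],\mathbb{R}_+)$ of càdlàg processes. See Chapter 3 of
Billingsley [1] for example. To simplify the presentation, all our processes will be
defined on the same probability space in the following way.

Let $\mathcal{N}_i^+$, $\mathcal{N}_i^-$, $i = 1, 2, 3$, be independent Poisson processes on $\mathbb{R}_+^2$ with rate 1
defined on a probability space $(\Omega, \mathcal{F}, P)$. If $A \in \mathcal{B}(\mathbb{R}_+^2)$ is a Borelian subset of
$\mathbb{R}_+^2$ and $(i, \varepsilon) \in \{1, 2, 3\} \times \{+, -\}$, $\mathcal{N}_i^\varepsilon(A)$ denotes the number of points of the process
$\mathcal{N}_i^\varepsilon$ in the subset $A$. For $t > 0$, one denotes by $\mathcal{F}_t$ the $\sigma$-field generated by the
random variables

$$\mathcal{N}_i^\varepsilon(B \times [0,t]) \quad\text{for } B \in \mathcal{B}(\mathbb{R}_+) \text{ and } (i, \varepsilon) \in \{1, 2, 3\} \times \{+, -\}.$$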


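It is easily seen that the process $(X_F^N(t)) = (I_F^N(t), M_F^N(t), P_F^N(t))$ has the same distribution as the solution
of the following stochastic differential equations (SDE)

$$dI_F^N(t) = 1_{\{I_F^N(t-)=0\}}\,\mathcal{N}_1^+([0,\lambda_1^+ N] \times [dt]) - 1_{\{I_F^N(t-)=1\}}\,\mathcal{N}_1^-([0,\lambda_1^- N P_F^N(t-)] \times [dt]), \tag{8}$$

$$dM_F^N(t) = 1_{\{I_F^N(t-)=1\}}\,\mathcal{N}_2^+([0,\lambda_2 N] \times [dt]) - \mathcal{N}_2^-([0,\mu_2 N M_F^N(t-)] \times [dt]), \tag{9}$$

$$dP_F^N(t) = \mathcal{N}_3^+([0,\lambda_3 M_F^N(t-)] \times [dt]) - \mathcal{N}_3^-([0,\mu_3 P_F^N(t-)] \times [dt]), \tag{10}$$

with the same initial condition. For any $N \ge 1$, $(X_F^N(t))$ is a Markov process
adapted to the filtration $(\mathcal{F}_t)$. These SDE can be rewritten as, for some function $f$
with finite support on $\mathcal{S}$,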



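$$\begin{aligned}
f(X_F^N(t)) = f(X_F^N(0)) &+ \int_0^t \left[\lambda_1^+ N (1 - I_F^N(u)) + \lambda_1^- N P_F^N(u) I_F^N(u)\right] \Delta_1(f)(X_F^N(u))\,du \\
&+ \int_0^t \lambda_2 N I_F^N(u)\,\Delta_2^+(f)(X_F^N(u))\,du + \int_0^t \mu_2 N M_F^N(u)\,\Delta_2^-(f)(X_F^N(u))\,du \\
&+ \int_0^t \lambda_3 M_F^N(u)\,\Delta_3^+(f)(X_F^N(u))\,du + \int_0^t \mu_3 P_F^N(u)\,\Delta_3^-(f)(X_F^N(u))\,du + W_f^N(t),
\end{aligned} \tag{11}$$

where, for $x = (i,m,p) \in \mathcal{S}$, the operators $\Delta_1$, $\Delta_2^\pm$ and $\Delta_3^\pm$ are defined by

$$\begin{aligned}
\Delta_1(f)(x) &= f(1-i,m,p) - f(x), \\
\Delta_2^+(f)(x) &= f(i,m+1,p) - f(x), & \Delta_2^-(f)(x) &= f(i,m-1,p) - f(x), \\
\Delta_3^+(f)(x) &= f(i,m,p+1) - f(x), & \Delta_3^-(f)(x) &= f(i,m,p-1) - f(x),
\end{aligned}$$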
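and $(W_f^N(t))$ is a local martingale whose previsible increasing process is given by

$$\begin{aligned}
\langle W_f^N \rangle(t) = &\int_0^t \left[\lambda_1^+ N (1 - I_F^N(u)) + \lambda_1^- N P_F^N(u) I_F^N(u)\right] \left[\Delta_1(f)(X_F^N(u))\right]^2 du \\
&+ \int_0^t \lambda_2 N I_F^N(u) \left[\Delta_2^+(f)(X_F^N(u))\right]^2 du + \int_0^t \mu_2 N M_F^N(u) \left[\Delta_2^-(f)(X_F^N(u))\right]^2 du \\
&+ \int_0^t \lambda_3 M_F^N(u) \left[\Delta_3^+(f)(X_F^N(u))\right]^2 du + \int_0^t \mu_3 P_F^N(u) \left[\Delta_3^-(f)(X_F^N(u))\right]^2 du.
\end{aligned} \tag{12}$$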


See Rogers and Williams [30] for example. 

Definition 1. Let $(\bar{M}^N(t), \bar{P}^N(t))$ be the Markov process with transition rates given
by

$$\begin{cases}
(m,p) \to (m+1,p) & \text{at rate } \lambda_2 N, \qquad (m,p) \to (m-1,p) & \text{at rate } \mu_2 m N, \\
(m,p) \to (m,p+1) & \text{at rate } \lambda_3 m, \qquad (m,p) \to (m,p-1) & \text{at rate } \mu_3 p,
\end{cases} \tag{13}$$

and initial state $(\bar{M}^N(0), \bar{P}^N(0)) = (m_0, p_0)$.

The process $(\bar{M}^N(t), \bar{P}^N(t))$ is simply the analogue of our process $(M_F^N(t), P_F^N(t))$
when the gene is always active.

Lemma 1. (1) For the convergence in distribution for the uniform norm on
compact sets,

$$\lim_{N\to+\infty} \left(\int_0^t \bar{M}^N(u)\,du\right) = (\rho_2 t).$$

(2) For $T > 0$,

$$\sup_{N \ge 1} E\left(\sup_{0 \le t \le T} \bar{P}^N(t)\right) < +\infty.$$



Proof. From Relations (13), it is easily seen that the process $(\bar{M}^N(t))$ can be
expressed as $(L_1(Nt))$ where $(L_1(t))$ is an $M/M/\infty$ queue with arrival rate $\lambda_2$ and service
rate $\mu_2$ with $L_1(0) = m_0$. See Chapter 6 of Robert [29] for example. Elementary
stochastic calculus gives, for $t \ge 0$,

$$L_1(Nt) = m_0 + \lambda_2 N t - \mu_2 \int_0^{Nt} L_1(u)\,du + M_1^N(t), \tag{14}$$

where $(M_1^N(t))$ is a local martingale whose previsible increasing process is given by

$$\langle M_1^N \rangle(t) = \lambda_2 N t + \mu_2 \int_0^{Nt} L_1(u)\,du.$$

Doob's Inequality shows that the process $(M_1^N(t)/N)$ vanishes for the convergence
in distribution as $N$ gets large.

For $\varepsilon > 0$ and $x \in \mathbb{N}$, let

$$T_x = \inf\{t \ge 0 : L_1(t) \ge x\}.$$

Proposition 6.10 of Robert [29] shows the convergence in distribution
where $E_0$ is an exponential random variable with parameter $\mu_2 \exp(-\rho_2)$. This
shows in particular that the process $(L_1(Nt)/N)$ converges in distribution to 0 for the
uniform convergence on compact intervals since

$$P\left(\sup_{0 \le t \le T} \frac{L_1(Nt)}{N} \ge \varepsilon\right) \le P\left(T_{\lceil \varepsilon N \rceil} \le NT\right).$$


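From Equation (14), one gets

$$\int_0^t \bar{M}^N(u)\,du = \frac{1}{N} \int_0^{Nt} L_1(u)\,du = \rho_2 t + \frac{1}{\mu_2}\left(\frac{m_0}{N} - \frac{L_1(Nt)}{N} + \frac{M_1^N(t)}{N}\right),$$

and therefore assertion (1) of the lemma.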
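Assertion (1) can be illustrated by simulating the $M/M/\infty$ queue $(L_1(t))$ directly and computing the rescaled integral $\frac{1}{N}\int_0^{Nt} L_1(u)\,du$. The following is a minimal sketch with hypothetical parameter values ($\lambda_2 = 2$, $\mu_2 = 1$, so that $\rho_2 = \lambda_2/\mu_2 = 2$); the function name and the seed are illustrative.

```python
import random


def scaled_integral_mm_infinity(lam2, mu2, n_scale, t, m0=0, seed=1):
    """Return (1/N) * integral over [0, N t] of an M/M/infinity queue started at m0."""
    rng = random.Random(seed)
    horizon = n_scale * t
    clock, level, area = 0.0, m0, 0.0
    while True:
        total_rate = lam2 + mu2 * level
        holding = rng.expovariate(total_rate)
        if clock + holding >= horizon:
            area += (horizon - clock) * level
            break
        area += holding * level
        clock += holding
        # Arrival with probability lam2 / total_rate, departure otherwise.
        if rng.random() * total_rate < lam2:
            level += 1
        else:
            level -= 1
    return area / n_scale


value = scaled_integral_mm_infinity(lam2=2.0, mu2=1.0, n_scale=2000, t=1.0)
print(value)  # expected to be close to rho_2 * t = 2 for large N
```

For a large scaling parameter the result is close to $\rho_2 t$, reflecting the averaging of the fast mRNA process that underlies the lemma.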

For the last assertion, the method is similar: one first writes the evolution
equation

$$\bar{P}^N(t) = p_0 + \lambda_3 \int_0^t \bar{M}^N(u)\,du - \mu_3 \int_0^t \bar{P}^N(u)\,du + M_2^N(t),$$

where $(M_2^N(t))$ is a local martingale whose previsible increasing process is given by

$$\langle M_2^N \rangle(t) = \lambda_3 \int_0^t \bar{M}^N(u)\,du + \mu_3 \int_0^t \bar{P}^N(u)\,du.$$






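Define $\bar{P}^{N,*}(t) = \sup\{\bar{P}^N(u) : 0 \le u \le t\}$, then for $0 \le t \le T$

$$E\left(\bar{P}^{N,*}(t)\right) \le p_0 + \lambda_3 E\left(\int_0^t \bar{M}^N(u)\,du\right) + E\left(\sup_{0 \le u \le t} |M_2^N(u)|\right) + \mu_3 \int_0^t E\left(\bar{P}^{N,*}(u)\right) du. \tag{15}$$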


Doob’s Inequality gives, for t <T, 


e(^ sup |Mf(u)|^ dM + 2 p 3 ^ E(pf(u)) du, 


N , 


and from the ergodic theorem for $(L_1(t))$ (recall that $\bar{M}^N(t) = L_1(Nt)$) one gets

$$\lim_{N\to+\infty} E\left(\int_0^T \bar{M}^N(u)\,du\right) = \rho_2 T.$$

One concludes by using Equation (15) and Gronwall's Lemma.


□ 


Proposition 5. The sequence $(P_F^N(t))$ is tight for the convergence in distribution
of càdlàg processes.


Proof. Aldous' criterion for tightness is used. See Theorem 4.5 page 320 of Jacod
and Shiryaev [T3] for example. For $T > 0$, one denotes by $\mathcal{T}_T$ the set of stopping
times associated to the filtration $(\mathcal{F}_t)$ which are bounded by $T$. For $\eta > 0$, let $\tau_1$,
$\tau_2 \in \mathcal{T}_T$ be such that $\tau_1 \le \tau_2 \le \tau_1 + \eta$. The respective probabilities that, on the time
interval $[\tau_1, \tau_2]$, no protein is made or that no protein is degraded are respectively
given by
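By using the strong Markov property, one gets the relation

$$P\left(|P_F^N(\tau_1) - P_F^N(\tau_2)| \ge 1\right) \le 1 - E\left(\exp\left(-\lambda_3 \int_{\tau_1}^{\tau_2} M_F^N(u)\,du\right)\right) + 1 - E\left(\exp\left(-\mu_3 \int_{\tau_1}^{\tau_2} P_F^N(u)\,du\right)\right).$$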

A simple coupling, using the same Poisson processes as in Equations (9)
and (10), gives a process $(\bar{M}^N(t), \bar{P}^N(t))$ as in Definition 1 on the same probability space such that
the relations $M_F^N(t) \le \bar{M}^N(t)$ and $P_F^N(t) \le \bar{P}^N(t)$ hold almost surely for all $t \ge 0$.

From the last relation, one gets the inequality 

E {\Pf (n) - Pf (r 2 )| > 1 ) < 1 - E (^exp (^-Ag ^ m""( u) dw^ 

+ 1 — E [ exp [ —^377 sup P^ {t) 

V \ 0<t<T 


< 1 — E I exp ( —A 3 sup / 
0<t<T Jt 




(u) dw 




+ 1 — E exp —71377 sup P (t) 

0<t<T 


Lemma gives the relation 

lim sup E f exp f-Aa [ (u) du^^ = 

IV^+oo^^gTy \ \ Jti )) 

and, for e > 0 , the existence of -ff > 0 such that 


-FiAf X 


sup P sup P (t) > Pi I < £. 

IV>1 \0<t<T 


Consequently 


lim lim sup I 

ri-^ON ^+00 

ri<<r2<Ti+?7 


(|Pi^(ri)-P;^(r2)|>l)=0, 


hence, by Aldous' criterion, the tightness of the sequence $(P_F^N(t))$ is established.
The proposition is proved.

□ 


A.2. Convergence of Occupation Measures. For $N \ge 1$ and $T > 0$, one defines
the random measure $\Lambda^N$ on $\mathcal{E}_T \stackrel{\text{def}}{=} \{0,1\} \times \mathbb{N}^2 \times [0,T]$ as follows, for a non-negative
Borelian function $G$ on $\mathcal{E}_T$,

$$\Lambda^N(G) = \int_0^T G(X_F^N(u), u)\,du.$$

If $A$ is a Borelian subset of $\mathcal{E}_T$, $\Lambda^N(A)$ denotes $\Lambda^N(1_A)$.

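Proposition 6. The sequence of random measures $(\Lambda^N)$ is tight and any of its
limiting points $\Lambda$ can be written as

$$\Lambda(G) = \sum_{(i,m,p) \in \mathcal{S}} \int_0^T G(i,m,p,u)\,\pi_p(i,m)\,\nu_u(p)\,du,$$

where, for any $u \le T$, $\nu_u$ is a positive measure on $\mathbb{N}$ such that, almost surely,

$$\int_0^t \nu_u(\mathbb{N})\,du = t, \quad \forall t \le T,$$

and, for $p \in \mathbb{N}$, $\pi_p$ is the invariant distribution of the Markov process on $\{0,1\} \times \mathbb{N}$
whose transition rates are given by, for $(i,m) \in \{0,1\} \times \mathbb{N}$,

$$\begin{cases}
(i,m) \to (1-i,m) & \text{at rate } \lambda_1^+(1-i) + \lambda_1^- p\,i, \\
(i,m) \to (i,m+1) & \text{at rate } \lambda_2 i, \\
(i,m) \to (i,m-1) & \text{at rate } \mu_2 m.
\end{cases} \tag{16}$$

Additionally, one has

$$\sum_{(i,m) \in \{0,1\} \times \mathbb{N}} m\,\pi_p(i,m) = \frac{\lambda_1^+}{\lambda_1^+ + \lambda_1^- p}\,\frac{\lambda_2}{\mu_2}. \tag{17}$$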


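Proof. For $K > 0$, if $\mathcal{K}_K$ is the compact subset $\{0,1\} \times [0,K]^2 \times [0,T]$ of $\mathcal{E}_T$, then

$$E\left(\Lambda^N(\mathcal{E}_T \setminus \mathcal{K}_K)\right) \le \int_0^T P\left(M_F^N(u) \ge K\right) du + T\,P\left(\sup_{0 \le u \le T} P_F^N(u) \ge K\right).$$

By using the same coupling as in the proof of Proposition 5, one gets that

$$E\left(\Lambda^N(\mathcal{E}_T \setminus \mathcal{K}_K)\right) \le \int_0^T P\left(\bar{M}^N(u) \ge K\right) du + T\,P\left(\sup_{0 \le u \le T} \bar{P}^N(u) \ge K\right).$$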

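By Lemma 1, for $\varepsilon > 0$, there exists some $K$ such that

$$\sup_{N \ge 1} E\left(\Lambda^N(\mathcal{E}_T \setminus \mathcal{K}_K)\right) \le \varepsilon.$$

Consequently, the sequence $(\Lambda^N)$ of random Radon measures on $\mathcal{E}_T$ is tight. See
Dawson [3 Lemma 3.28, page 44] for example.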

Let $\Lambda$ be a limiting point of some subsequence $(\Lambda^{N_k})$. By using
Radon-Nikodym's Theorem, see Chapter 8 of Rudin [52] for example, it is not difficult
to see that there exist some non-negative random variables $(\ell_u(x)(\omega), (\omega,x,u) \in
\Omega \times \mathcal{S} \times [0,T])$ such that $(\omega, x, u) \mapsto \ell_u(x)(\omega)$ is measurable and $\Lambda$ can be expressed
as

$$\Lambda(G) = \sum_{x \in \mathcal{S}} \int_0^T G(x,u)\,\ell_u(x)\,du.$$

From the domination relation of Lemma 1 one gets that, almost surely, there is no
loss of mass, i.e.

$$\int_0^t \ell_u(\mathcal{S})\,du = t, \quad \forall t \le T, \tag{18}$$

holds almost surely. Now take a function $f$ with bounded support on $\mathcal{S}$; by using
Equation (12), it is not difficult to show that the process $(\langle W_f^N \rangle(t))$ satisfies the
relation


by Doob's Inequality this implies that the martingale $(W_f^N(t)/N)$ converges in
distribution to 0 for the uniform norm on $[0,T]$.
By dividing Relation (11) by $N$, one gets that, for the convergence in distribution,
the relation

$$\begin{aligned}
\lim_{N\to+\infty} \Bigg( &\int_0^t \lambda_1^+ (1 - I_F^N(u))\,\Delta_1(f)(X_F^N(u))\,du + \int_0^t \lambda_1^- P_F^N(u) I_F^N(u)\,\Delta_1(f)(X_F^N(u))\,du \\
&+ \int_0^t \lambda_2 I_F^N(u)\,\Delta_2^+(f)(X_F^N(u))\,du + \int_0^t \mu_2 M_F^N(u)\,\Delta_2^-(f)(X_F^N(u))\,du \Bigg) = 0
\end{aligned}$$

holds. The convergence of the sequence $(\Lambda^{N_k})$ gives that the relation


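$$\begin{aligned}
\sum_{x=(i,m,p) \in \mathcal{S}} \Bigg( &\int_0^t \lambda_1^+ (1-i)\,\Delta_1(f)(x)\,\ell_u(x)\,du + \int_0^t \lambda_1^- i p\,\Delta_1(f)(x)\,\ell_u(x)\,du \\
&+ \int_0^t \lambda_2 i\,\Delta_2^+(f)(x)\,\ell_u(x)\,du + \int_0^t \mu_2 m\,\Delta_2^-(f)(x)\,\ell_u(x)\,du \Bigg) = 0
\end{aligned}$$

holds almost surely for all $0 \le t \le T$ and for all indicator functions of elements of
$\mathcal{S}$. Now, for $p \in \mathbb{N}$ and $g$ a function with finite support on $\{0,1\} \times \mathbb{N}$, define
$f(i,m,p) = g(i,m)$, the above relation gives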

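$$\begin{aligned}
\sum_{x=(i,m,p) \in \mathcal{S}} \Big( &\ell_u(i,m,p)\,\lambda_1^+ (1-i)\,\Delta_1(g)(i,m) + \ell_u(i,m,p)\,\lambda_1^- i p\,\Delta_1(g)(i,m) \\
&+ \ell_u(i,m,p)\,\lambda_2 i\,\Delta_2^+(g)(i,m) + \ell_u(i,m,p)\,\mu_2 m\,\Delta_2^-(g)(i,m) \Big) = 0
\end{aligned} \tag{19}$$

holds almost surely for all $u \in \mathcal{A}_1 \subset [0,T]$ and $[0,T] - \mathcal{A}_1$ is negligible for Lebesgue
measure. Relation (19) shows that for $u \in \mathcal{A}_1$, the vector $(\ell_u(i,m,p))$ is proportional
to the invariant distribution $\pi_p$ of the Markov process on $\{0,1\} \times \mathbb{N}$ whose transition
rates are given by Relations (16).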

One gets therefore the existence of a constant $\nu_u(p)$ such that $\ell_u(i,m,p) =
\nu_u(p)\,\pi_p(i,m)$ for all $(i,m,p) \in \mathcal{S}$. Equation (18) gives the relation

$$\int_0^t \nu_u(\mathbb{N})\,du = t, \quad \forall t \le T.$$

Hence one has $\nu_u(\mathbb{N}) = 1$ almost surely for all $u \in \mathcal{A}_2 \subset [0,T]$ and $[0,T] - \mathcal{A}_2$ is
negligible for Lebesgue measure.

Straightforward calculations as in the proof of Proposition complete the proof 
of the proposition to give Relation (1T7|. □ 